Skip to content

Conversation

@maropu
Copy link
Member

@maropu maropu commented Jun 9, 2020

What changes were proposed in this pull request?

This PR intends to add a build-in SQL function - WIDTH_BUCKET.
It is the rework of #18323.

Closes #18323

The other RDBMS references for WIDTH_BUCKET:

Why are the changes needed?

For better usability.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added unit tests.

@SparkQA
Copy link

SparkQA commented Jun 9, 2020

Test build #123675 has finished for PR 28764 at commit 668f419.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 9, 2020

Test build #123673 has finished for PR 28764 at commit 6323713.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 9, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jun 9, 2020

Test build #123676 has finished for PR 28764 at commit 668f419.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 9, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jun 9, 2020

Test build #123694 has finished for PR 28764 at commit 668f419.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 9, 2020

Could you review this? @HyukjinKwon @viirya @wangyum @yaooqinn

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay overall. cc @cloud-fan @dongjoon-hyun

@dongjoon-hyun
Copy link
Member

Thank you for pinging me, @viirya .

@dongjoon-hyun
Copy link
Member

+1, LGTM (only minor comments). Thanks, @maropu .

@maropu
Copy link
Member Author

maropu commented Jun 11, 2020

Thanks for your kind reviews, @dongjoon-hyun !

@SparkQA
Copy link

SparkQA commented Jun 11, 2020

Test build #123808 has finished for PR 28764 at commit 9e8eab2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 11, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jun 11, 2020

Test build #123795 has finished for PR 28764 at commit cace933.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 11, 2020

Test build #123823 has finished for PR 28764 at commit 9e8eab2.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jun 11, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jun 11, 2020

Test build #123829 has finished for PR 28764 at commit 9e8eab2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Thank you, @maropu and all.
Merged to master.

@maropu
Copy link
Member Author

maropu commented Jun 11, 2020

Thanks a lot, @dongjoon-hyun, @viirya, and @yaooqinn !

@HyukjinKwon
Copy link
Member

late LGTM!

| org.apache.spark.sql.catalyst.expressions.Uuid | uuid | SELECT uuid() | struct<uuid():string> |
| org.apache.spark.sql.catalyst.expressions.WeekDay | weekday | SELECT weekday('2009-07-30') | struct<weekday(CAST(2009-07-30 AS DATE)):int> |
| org.apache.spark.sql.catalyst.expressions.WeekOfYear | weekofyear | SELECT weekofyear('2008-02-20') | struct<weekofyear(CAST(2008-02-20 AS DATE)):int> |
| org.apache.spark.sql.catalyst.expressions.WidthBucket | width_bucket | SELECT width_bucket(5.3, 0.2, 10.6, 5) | struct<widthbucket(CAST(5.3 AS DOUBLE), CAST(0.2 AS DOUBLE), CAST(10.6 AS DOUBLE), CAST(5 AS BIGINT)):bigint> |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output schema is widthbucket? It should be width_bucket if we define the prettyName to width_bucket?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, it looks better. I'll fix it later.

override def children: Seq[Expression] = Seq(value, minValue, maxValue, numBucket)
override def inputTypes: Seq[AbstractDataType] = Seq(DoubleType, DoubleType, DoubleType, LongType)
override def dataType: DataType = LongType
override def nullable: Boolean = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

override def prettyName: String = "width_bucket" ?

dongjoon-hyun pushed a commit that referenced this pull request Jul 22, 2020
### What changes were proposed in this pull request?

This PR is to define prettyName for `WidthBucket`.
This comes from the gatorsmile's suggestion: #28764 (comment)

### Why are the changes needed?

For a better name.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #29183 from maropu/SPARK-21117-FOLLOWUP.

Authored-by: Takeshi Yamamuro <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants